Abstract

Population structure is known to cause false-positive detection in association studies. We compared the power, precision,
and type-I error rates of various association models in analyses of a simulated dataset with structure at the population
(admixture from two populations; P) and family (K) levels. We also compared type-I error rates among models in analyses of
publicly available human and dog datasets. The models corrected for none, one, or both structure levels. Correction for K
was performed with linear mixed models incorporating familial relationships estimated from pedigrees or genetic markers.
Linear models that ignored K were also tested. Correction for P was performed using principal component or structured
association analysis. In analyses of simulated and real data, linear mixed models that corrected for K were able to control for
type-I error, regardless of whether they also corrected for P. In contrast, correction for P alone in linear models was
insufficient. The power and precision of linear mixed models with and without correction for P were similar. Furthermore,
power, precision, and type-I error rate were comparable in linear mixed models incorporating pedigree and genomic
relationships. In summary, in association studies using samples with both P and K, ancestries estimated using principal
components or structured assignment were not sufficient to correct type-I errors. In such cases type-I errors may be
controlled by use of linear mixed models with relationships derived from either pedigree or from genetic markers.
Introduction

The power of an association study depends on the phenotypic
variance explained by the causal variant, the extent of linkage 
disequilibrium (LD) between the causal variant and the markers, 
and, not least, the size of the study sample. Several recent
association studies have thus focused on collecting large samples to
obtain higher power of detection [1,2], but this practice is often
associated with population stratification problems. Among the
factors thought to cause lack of reproducibility in association
studies (reviewed in [3]), population stratification has probably
been the most frequently cited [4].

Population stratification refers to the inclusion of individuals 
from isolated subpopulations in the population of interest. In such 
a population, individuals from a subpopulation are, on average, 
more closely related to each other than to other individuals in the 
population as a whole. Population structure is common in nature. 
It manifests in the form of herds, colonies, and ethnic groups, and 
as a consequence of geographic isolation and natural or artificial 
selection [5]. A subtle form of stratification can also occur at the
family level, especially in livestock when animals are bred in
full-sib or half-sib families [6].



Genetic association studies aim to correlate differences in trait 
level with differences in genotypes at a tested marker. This 
correlation is assumed to arise from the co-occurrence of the 
marker with the quantitative trait locus (QTL), i.e., LD. The 
presence of LD is usually equated with physical linkage. However, 
in a stratified population with differences at the trait level, markers 
with different frequencies across populations will also correlate 
with trait levels. Thus the use of a structured population in an 
association study may yield false associations due to differences in 
allele frequencies among subpopulations [7]. 
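
The effect can be illustrated with a toy numeric sketch. The genotypes and phenotypes below are made up for illustration and are not taken from the study: within each subpopulation the marker is unrelated to the trait, yet pooling two subpopulations that differ in both allele frequency and mean trait level yields a non-zero regression slope.

```python
def slope(genotypes, phenotypes):
    """Least-squares slope of phenotype on additively coded genotype (0, 1, 2)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    return sxy / sxx

# Hypothetical data: subpopulation A has a low allele frequency and a low
# trait mean; subpopulation B has a high allele frequency and a high trait mean.
geno_a, pheno_a = [0, 0, 1, 1], [1.0, 2.0, 1.0, 2.0]
geno_b, pheno_b = [1, 1, 2, 2], [3.0, 4.0, 3.0, 4.0]

slope_a = slope(geno_a, pheno_a)            # no association within A
slope_b = slope(geno_b, pheno_b)            # no association within B
slope_pooled = slope(geno_a + geno_b, pheno_a + pheno_b)  # spurious association

print(slope_a, slope_b, slope_pooled)  # prints: 0.0 0.0 1.0
```

The pooled slope arises purely from the between-population differences, which is the confounding that the models compared in this study try to remove.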

Examination of an unstratified sample is thus optimal in an 
association study [5]. However, combining samples from multiple
populations is often necessary to increase the power of detection. 
In some cases, study of a structured population may be 
advantageous. Markers tightly linked to causal variants achieve 
similar significance in tests of association, making it difficult to 
distinguish them from the causal variant. In such cases, the use of a 
structured sample may enable precise localization of the causal 
variant, as a marker in strong LD with the causal variant in one 
population may not be in strong LD in the other [8]. Thus,
statistical methods that correct for stratification at the
population and family levels, especially in livestock, are important
for present and future association studies that aim to genotype
large samples.

Several methods, such as structured analysis (SA) [9], principal
component analysis (PCA) [10] and the use of linear mixed models
(LMM) [5], have been reported to be useful in correcting for
stratification in genome-wide association (GWA) samples. It has
been noted that correcting for stratification is more challenging
when family structure is present along with population
stratification [11]. In this study, we simulated an admixed population
(variable ancestry from two populations) with familial relatedness
as generally observed in commercial cattle populations and
compared the power, precision and false-positive detection rates
of analyses using the SA, PCA, and LMM approaches. We also
applied these methods to two publicly available real datasets and
compared the incidence of false-positive associations.

Materials and Methods 

The study uses a simulated dataset and two publicly available
datasets, so no ethical approval was required.

Simulation Study 

An admixed population (Fig. 1) was simulated using the 
QMSim software package [12]. First, a historic population
(HP) was simulated to generate the initial linkage disequilibrium 
(LD) and to allow for mutation (2.5 xlO~^/locus/generation) and 
drift. The HP consisted of 5,000 individuals that were randomly 
mated for 1,000 generations. The genome was assumed to have 20 
chromosomes of 100 cM each with 40,000 evenly distributed bi- 
allelic SNP markers with equal frequencies (0.5) for the two alleles 
in the base population. Only two QTLs with two equally frequent 
alleles were simulated, both located on chromosome 1 at 20 cM 
and 80 cM. The QTL mutation rate was 2.5 x 10~''/generation. 
The number of recombinations per Morgan (per chromosome) was
sampled from a Poisson distribution with mean = 1, and the
crossovers were randomly placed on the chromosomes. QTL effects
were introduced in the last generation of the HP for a trait
(mean = 0, variance = 1) with a heritability of 0.3. The allelic
effects were sampled from a gamma distribution with a shape
parameter of 0.4 as implemented in QMSim [12], so that the two
QTLs together explained 10% of phenotypic variance. The 
remaining genetic variance was simulated as a polygenic effect. 
After 1,000 generations, the population size was reduced to 2,500 
individuals (2,000 females, 500 males). 

Next, two populations were generated from the HP by mating 
two separate parent groups in isolation. The parent groups (1,000 
females, 50 males each) were randomly selected from the last 
generation of the HP. Each sire was mated to 20 dams, with each 
mating producing one offspring. This procedure yielded a total of 
1,000 offspring with an equal sex ratio (500 males, 500 females) 
from each parent group. Fifty randomly selected male offspring 
from each group were used as sires for the next generation, 
whereas all female offspring along with 500 randomly selected 
dams from the previous generation were used as dams for the next 
generation. This mating scheme was continued for 30 generations. 

After 30 generations, the average genetic distance between
populations in 100 replicates, estimated as the mean over all
40,000 SNP markers, was 0.025 [standard deviation
(SD) = 0.0006]. The allele frequencies of the markers (Fig. 2) and
mean phenotype levels (Fig. 3) differed between populations. 

Figure 1. Schematic representation of the simulation.

Figure 2. Absolute differences in allele frequencies of single-nucleotide
polymorphism (SNP) markers between populations (100 replicates).

Figure 3. Differences in mean phenotypes of two populations
after separation for 30 generations (100 replicates).

doi:10.1371/journal.pone.0088926.g003

In the next step of the simulation, individuals from the two isolated
populations were mated to yield an admixed population. Twenty-five
males and 500 females were selected from each population to
serve as parents. These individuals were randomly mated for four
generations to yield an admixed population of 1,000 individuals
(500 males, 500 females), following the mating scheme described
above. After admixture, QTLs in the last generation explained
0-13% of the total phenotypic variance. The individuals' ancestries,
calculated as average genetic contributions from one parent
population to individuals in the admixed population over four
generations, are presented in Figure 4. Data on phenotypes for
these 1,000 samples, along with their genotypes for 40,000 SNP
markers, were used in our association analyses.
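
The ancestry calculation described above can be sketched as a pedigree recursion. This is an illustrative assumption about how an expected genetic contribution propagates (each offspring receives half of its ancestry from each parent); the function name and the example matings are hypothetical and not taken from the simulation software.

```python
def offspring_ancestry(sire_ancestry, dam_ancestry):
    """Expected proportion of the genome from population 1, given both parents.

    Assumption: each parent contributes half of the offspring's genome on
    average, so the expected ancestry is the mean of the parental ancestries.
    """
    return 0.5 * (sire_ancestry + dam_ancestry)

# Founders are coded 1.0 (population 1) or 0.0 (population 2).
gen1 = offspring_ancestry(1.0, 0.0)    # first-generation cross
gen2 = offspring_ancestry(gen1, 1.0)   # mated back to a population-1 animal
gen3 = offspring_ancestry(gen2, 0.0)   # mated to a population-2 animal

print(gen1, gen2, gen3)  # prints: 0.5 0.75 0.375
```

After a few generations of random mating, such a recursion yields a continuous spread of ancestries between 0 and 1, which is the kind of variable admixture the simulated sample was designed to contain.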

Chromosome 1, which contained two simulated QTLs, was
used to assess the power and precision of the models, and five
chromosomes (chromosomes 2-6) without simulated QTLs (null
chromosomes) were used to assess the false-positive rate (type-I
error). A total of 100 replications were used to compare the
models.

Publicly Available Real Data 

In addition to the simulated datasets, the models for GWA 
studies were applied to the following two publicly available 
datasets to assess their ability to control type-I error. 

Human GOLDN Dataset. This human stature dataset was
collected from 1,315 European American individuals genotyped
for 637 genetic markers as part of the Genetics of Lipid Lowering
Drugs and Diet Network (GOLDN) study [13,14]. Individuals'
height, sex, and age were recorded. Previous studies have found no
significant population structure in this sample and no significant
association between height and any of these genotyped markers.

Dog Data on Hip Dysplasia. This dataset contains hip
dysplasia data from 292 dogs of two breeds (Labrador retriever
and greyhound) and their crosses (F1, F2, and backcrosses). The
samples were genotyped for 23,500 SNPs and hip dysplasia was
measured using the Norberg angle [15]. The relationship matrix
was estimated from pedigree. The ancestors of each dog were
traced back as far as possible. We used 1,000 randomly chosen
markers assumed to be unlinked to the variants controlling hip
dysplasia [14].

Association Models 

The allele substitution effect (b) of an SNP was estimated by
successively fitting the marker in the following models. The
simulated data contained two confounding factors: admixture (P)
and familial relatedness (K). The models compared corrected for
none, one, or both factors.

The significance of the estimated b was then tested against the null
hypothesis (b = 0) using a t-test with Bonferroni correction for
multiple testing, at a significance level of α = 0.05. All analyses
were carried out with the DMU software package [16].
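
A minimal sketch of the single-marker test with a Bonferroni-adjusted threshold follows. It is written in plain Python for illustration and is not the DMU implementation; the function names and the example genotypes and phenotypes are hypothetical.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def marker_t_statistic(genotypes, phenotypes):
    """Estimate b by ordinary least squares and return (b, standard error, t)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    b = sxy / sxx
    intercept = mean_y - b * mean_g
    # Residual sum of squares with n - 2 degrees of freedom (mean and slope).
    sse = sum((y - intercept - b * g) ** 2 for g, y in zip(genotypes, phenotypes))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    return b, se_b, b / se_b

# Hypothetical toy data: six individuals, genotypes coded 0/1/2.
b, se_b, t_stat = marker_t_statistic([0, 1, 2, 0, 1, 2],
                                     [1.0, 2.1, 2.9, 1.2, 1.9, 3.1])
threshold = bonferroni_threshold(0.05, 10000)  # 10,000 SNPs on five null chromosomes
```

The t statistic would then be referred to a t distribution with n - 2 degrees of freedom, and the resulting P-value compared with the threshold.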

1) Linear Model (LM). A linear model (LM) ignoring K and P
was used to estimate the allele substitution effect:

$$y = \mathbf{1}\mu + bg + e$$

where y was a vector of phenotypes, $\mu$ was an overall mean, $\mathbf{1}$
was a vector of ones, b was the allele substitution effect of the
SNP, g was a vector of additively coded (0, 1 and 2) SNP
genotypes and e was a vector of random residuals with normal
distribution $N(0, I\sigma_e^2)$, where $\sigma_e^2$ is the error variance and $I$ is
the identity matrix.
2) Linear Mixed Model Including Pedigree-Based Relationship
(LMMped). An LMM accounting only for K
was used to test the associations of single SNPs with the
phenotype:

$$y = \mathbf{1}\mu + bg + u + e$$

The terms in the model are similar to those in the LM, with
the addition of random polygenic effects (u). K was included
in the model as the variance-covariance structure (A) of u,
estimated from the pedigree relationships. u was assumed to
be multivariate normally distributed, $u \sim N(0, \sigma_u^2 A)$, where
$\sigma_u^2$ is the genetic variance and A is the additive genetic
relationship matrix derived from pedigree records for the last
three generations.

Figure 4. Distribution of individuals' ancestries after admixture
(100 replicates).

doi:10.1371/journal.pone.0088926.g004

3) Linear Mixed Model Including Genomic Relationship
(LMMgmat). This LMM was similar to the LMMped
accounting for K, differing only in the estimation of the
variance-covariance matrix of u. Instead of a pedigree-based
relationship matrix (A), genome-wide SNP markers were used
to estimate the genomic relationship matrix (G) [17]. u was
assumed to be multivariate normally distributed,
$u \sim N(0, \sigma_u^2 G)$.

4) Structured Association. A model based on the clustering
method of the STRUCTURE software package [9] was used
to correct for P. One thousand markers evenly distributed
over the genome were selected and individuals were assigned
to two clusters (k = 2) in a STRUCTURE linkage model [18].
Admixture was assumed and partial membership to a cluster
was allowed. The membership of an individual in a cluster (s,
varying continuously between 0 and 1) was used as a covariate
in the following models to correct for P.

4a) Linear Model with STRUCTURE (LMstr). This
model was similar to the LM, with an additional fixed
regression (c) of the phenotype on s to account only for P
while ignoring K:

$$y = \mathbf{1}\mu + bg + cs + e$$

4b) Linear Mixed Model with STRUCTURE
(LMMstr). This model was similar to LMMped, with
the addition of a fixed regression of the phenotype on s to
correct for K and P:

$$y = \mathbf{1}\mu + bg + u + cs + e$$



5) Principal Component Analysis (PCA). PCA was
performed to correct for P in the sample. Principal components
(PCs) for the covariance matrix, estimated from all 40,000
SNP markers, were calculated. SNP genotypes and phenotypes
were corrected for the first two PCs, as described
elsewhere [10].

In the simulated datasets, the top two PCs explained on average
2.23% of the total variance (range 2.05 to 2.62%).
Although the fraction of the total variance explained was
low, in most cases one of the first two PCs for the simulated
datasets (Fig. S1) or the first PC for the dog dataset (Fig. S2)
showed the strongest correlations with known ancestries and
separated samples on ancestries. We also found that in most of
the cases the correlation between PCs and known ancestries
decayed rapidly beyond the first two PCs (Fig. S3). Thus, the
first two PCs were used to correct for P. Although PCs did not
separate the human samples, two PCs were used to correct the
genotypes and phenotypes for comparison. The corrected
genotypes (g*) and phenotypes (y*) were fitted in the following
models to correct for P.

5a) Principal Component Analysis in a Linear
Model (LMpca). Here, y* and g* were fitted in the LM
to account only for P while ignoring K:

$$y^* = \mathbf{1}\mu + bg^* + e$$

5b) Principal Component Analysis in a Linear
Mixed Model (LMMpca). Here, y* and g* were fitted
in LMMped to correct for K and P:

$$y^* = \mathbf{1}\mu + bg^* + u + e$$
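
The mixed-model test shared by the LMM variants above can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions, not the DMU implementation: the variance components are treated as known (in practice they would be estimated, for example by REML), the genomic relationship matrix uses one common allele-frequency-scaled construction that may differ from the estimator in [17], and all function names and toy data are hypothetical.

```python
import numpy as np

def genomic_relationship(M):
    """Genomic relationship matrix from an (individuals x markers) 0/1/2 matrix.

    Assumed construction: centre genotypes by twice the allele frequency and
    scale by 2 * sum(p * (1 - p)).
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0            # allele frequency per marker
    Z = M - 2.0 * p                     # centred genotypes
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gls_marker_test(y, g, K, var_u, var_e):
    """Generalized least squares estimate of b in y = 1*mu + b*g + u + e.

    K is the relationship matrix (A or G); var_u and var_e are assumed known.
    Returns (b, standard error, b / standard error).
    """
    y = np.asarray(y, dtype=float)
    g = np.asarray(g, dtype=float)
    n = y.shape[0]
    X = np.column_stack([np.ones(n), g])                 # mean and marker
    V = var_u * np.asarray(K, dtype=float) + var_e * np.eye(n)
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    info = X.T @ Vinv_X                                  # X' V^-1 X
    beta = np.linalg.solve(info, X.T @ Vinv_y)
    cov_beta = np.linalg.inv(info)
    se_b = float(np.sqrt(cov_beta[1, 1]))
    return float(beta[1]), se_b, float(beta[1]) / se_b

# Hypothetical toy data: 20 individuals, 50 markers, random phenotypes.
rng = np.random.default_rng(1)
M = rng.integers(0, 3, size=(20, 50))
y = rng.normal(size=20)
G = genomic_relationship(M)
b, se_b, ratio = gls_marker_test(y, M[:, 0], G, var_u=0.3, var_e=0.7)
```

Setting `var_u` to zero reduces the estimate to the ordinary least-squares slope of the LM, which makes explicit that the LMMs differ from the LMs only through the covariance structure of u.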



Comparison of the Models 

Type-I Error. Simulated Data. SNPs on five null chromosomes
(with no simulated QTL) were used to assess the type-I error rates
of all described models. A total of 1,000,000 SNPs over 100
replicates were fitted into the models to estimate b. The
distribution of -log10 P-values of the t-test was plotted against
the null (uniform) distribution in a quantile-quantile (QQ) plot.
Deviation from the null distribution was used to assess the type-I
error rate. In addition, the number of observed significant SNPs
on null chromosomes was compared with the expected number at
different significance levels.
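
The two checks described above can be sketched in a few lines of plain Python. The plotting positions (i - 0.5)/n used for the expected quantiles are an assumed convention, and the function names are hypothetical.

```python
import math

def expected_false_positives(n_tests, alpha):
    """Number of tests expected to be significant under the null hypothesis."""
    return n_tests * alpha

def observed_significant(p_values, alpha):
    """Number of tests with a P-value below the significance level."""
    return sum(1 for p in p_values if p < alpha)

def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 P-values for a QQ plot.

    Under the null hypothesis P-values are uniform on (0, 1), so the i-th
    smallest of n P-values is expected near (i - 0.5) / n.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))

# With 10,000 null SNPs per replicate, 500 are expected at a level of 0.05.
n_expected = expected_false_positives(10000, 0.05)

# Perfectly uniform P-values fall exactly on the diagonal of the QQ plot.
uniform_p = [(i - 0.5) / 100 for i in range(1, 101)]
points = qq_points(uniform_p)
```

Points lying above the diagonal indicate more small P-values than expected (inflated type-I error); points below it indicate a conservative test.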

Real Data. The human data were analyzed with LM,
LMMped, LMMpca (first two PCs), and LMMstr (k = 2), with
the inclusion of sex, age, and squared age as covariates [14].
Previous studies have reported no significant association between
the SNPs and height. The deviation of -log10 P-values of the t-test
from the expected uniform distribution in a QQ plot was used to
assess the type-I error rate.

The data on hip dysplasia in dogs were fitted to the LM,
LMMped, LMMpca (first two PCs), LMMstr (k = 2), and
LMMpedB, a model similar to LMMped but with the addition
of breed (as proportion of Labrador retriever) as a fixed regression.
The 1,000 randomly chosen markers were assumed to be
unassociated with the causal variants for hip dysplasia. The
distribution of -log10 P-values of the test of association was
compared with the expected distribution under the assumption of
no association in a QQ plot. Deviation from the expected
distribution was used to assess the type-I error rate.

Power and Precision. The power of the models to detect the
QTL was assessed from chromosome 1, with two simulated QTLs,
i.e., for 199 QTLs over 100 replicates (one QTL fixed for an allele
was left out of the analysis). A QTL was considered detected when
an SNP within 1 cM of the simulated QTL showed significance
after Bonferroni correction (for 12,000 SNPs tested on six
chromosomes) at α = 0.05. The absolute distance between the
detected and simulated QTL positions was calculated to ascertain
the precision of the methods in locating the QTL. Power and
precision were estimated separately for small-effect QTLs (explaining
<5% of $\sigma_p^2$) and large-effect QTLs (explaining >5% of $\sigma_p^2$).
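
The detection rule and the precision measure can be sketched as follows. The text does not state which SNP defines the detected position, so this sketch assumes it is the most significant SNP inside the 1 cM window; the function name and the example values are hypothetical.

```python
def evaluate_qtl_detection(snp_results, qtl_position_cm, n_tests,
                           alpha=0.05, window_cm=1.0):
    """Return (detected, absolute distance in cM) for one simulated QTL.

    snp_results is a list of (position in cM, P-value) pairs. A QTL counts as
    detected if any SNP within window_cm of it passes the Bonferroni threshold.
    Assumption: the detected position is the most significant such SNP.
    """
    threshold = alpha / n_tests
    hits = [(p, pos) for pos, p in snp_results
            if abs(pos - qtl_position_cm) <= window_cm and p < threshold]
    if not hits:
        return False, None
    _, best_pos = min(hits)               # smallest P-value inside the window
    return True, abs(best_pos - qtl_position_cm)

# Hypothetical results around a QTL simulated at 20 cM on chromosome 1.
results = [(19.2, 1e-3), (19.6, 2e-7), (20.3, 1e-8), (25.0, 1e-9)]
detected, distance = evaluate_qtl_detection(results, 20.0, n_tests=12000)
```

In this example the SNP at 25 cM is ignored despite its small P-value because it lies outside the window, and the QTL is localized 0.3 cM from its true position.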

Results 

Results from the Simulation Study

Type-I Error. The distributions of -log10 P-values for the
SNPs on the five null chromosomes were compared with expected
uniform distributions (Fig. 5). In general, type-I error rates were
lower in LMMs, where familial relationship (K) was used to model
the covariance structure of the random individual effect, than in
LMs (which did not account for K). Among LMMs, LMMpca
closely followed the expected type-I error rate, whereas LMMgmat
was the most conservative, showing a lower than expected type-I
error rate.

The average number of significant SNPs on the null chromosomes
is compared with expected numbers under the null
hypothesis at different significance levels in Table 1. At a
Bonferroni-corrected significance level, α = 0.05 (corrected for
10,000 SNPs tested on five chromosomes), LMs that did not
account for K (LM, LMstr, and LMpca) showed significantly
higher false-positive results than expected (p<0.01), whereas
LMMs that corrected for K (LMMped, LMMstr, LMMpca, and
LMMgmat) showed better control of false-positive results.

Figure 5. Quantile-quantile plot of -log10 P-values for association
tests using different models in the simulated dataset.

doi:10.1371/journal.pone.0088926.g005

Power. The power of the models to detect two simulated
QTLs on chromosome 1 using 100 replicates with 199 QTLs is
summarized in Figure 6. Power was compared only among LMMs
because the LMs had very high false-positive detection rates
(Fig. 5). No LMM had the power to detect small-effect (<5% of
$\sigma_p^2$) QTLs. The power was ~5%, at the level expected with
α = 0.05. In contrast, all models tested showed 53-60% power to
detect large-effect (explaining >5% of $\sigma_p^2$) QTLs. LMMped and
LMMstr had the highest powers (59.6%), followed by LMMpca
(57.6%) and LMMgmat (53.5%), although this difference was not
significant [χ² (3, n = 99) = 0.9925; p = 0.8031]. Overall, LMMped
had the highest power, detecting 32% of simulated QTLs,
followed by LMMstr (31.7%), LMMpca (30.7%), and LMMgmat
(28.1%); however, this difference was also not significant [χ² (3,
n = 199) = 0.8983; p = 0.8258].
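
The chi-square comparison of power among the four LMMs can be retraced with a short plain-Python sketch. The detection counts are not given in the text; the values used here (59, 59, 57, 53 of 99 large-effect QTLs and 64, 63, 61, 56 of all 199 QTLs) are back-calculated from the reported percentages and are therefore an assumption. The function names are hypothetical.

```python
import math

def chi_square_homogeneity(detected, total):
    """Pearson chi-square for equal detection rates across models.

    detected: number of QTLs detected by each model; total: QTLs per model.
    Returns (statistic, degrees of freedom).
    """
    k = len(detected)
    pooled = sum(detected) / (k * total)
    exp_det = total * pooled              # expected detected per model
    exp_miss = total * (1.0 - pooled)     # expected undetected per model
    stat = sum((d - exp_det) ** 2 / exp_det
               + ((total - d) - exp_miss) ** 2 / exp_miss
               for d in detected)
    return stat, k - 1

def chi_square_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 df (closed form)."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Counts assumed from the reported percentages (LMMped, LMMstr, LMMpca, LMMgmat).
stat_large, df_large = chi_square_homogeneity([59, 59, 57, 53], 99)
stat_all, df_all = chi_square_homogeneity([64, 63, 61, 56], 199)
p_large = chi_square_sf_3df(stat_large)
p_all = chi_square_sf_3df(stat_all)
```

Under these assumed counts the statistics and P-values agree with the reported values to the precision given in the text.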

Out of the total 199 simulated QTLs, 53 were detected by all
four models, the majority (50) of which were large-effect QTLs
(mean genetic variance = 0.087, SD = 0.02). On the other hand,
133 QTLs were not detected by any of the models, the majority (95) of
them being small-effect QTLs (mean genetic variance = 0.031,
SD = 0.032).

Precision. The precision of the models, measured as the 
absolute distance (cM) between the positions of simulated and 
detected QTLs, is described in Figure 6 and Table 2. All four 
methods compared showed greater precision for the localization of 
large-effect compared with small-effect QTLs. On average, large- 
effect QTLs were localized within 0.36-0.39 cM of the simulated 
QTL positions, whereas small-effect QTLs were localized within 
0.62-0.66 cM. LMMped was most precise (±0.62 cM) for small- 
effect QTLs, and LMMgmat was most precise (±0.36 cM) for 
large-effect QTLs. 

Results from the Published Data 

Human GOLDN Dataset. For the Human GOLDN dataset,
type-I error was compared among the LM, LMMped, LMMstr,
and LMMpca. The distribution of the test statistic is given in a QQ
plot (Figure 7). The LM without correction for K and P showed the
highest false-positive rate. The three LMMs (LMMped, LMMstr,
and LMMpca) showed similar distributions of -log10 P-values,
which were slightly higher than expected.

Dog Data on Hip Dysplasia. The dog data were analyzed
using the LM, LMMped, LMMpedB, LMMstr, and LMMpca.
The type-I error rates for these methods are compared in a QQ
plot in Figure 8. The LM showed the highest rate of type-I errors,
and this rate was comparable among LMMs, with LMMpedB and
LMMstr showing the lowest type-I error rates.

Discussion 

We compared the power, precision and false-positive association
rates of several GWA models in analyses of simulated admixed
data with familial relationships. LMs that did not model K (LM,
LMstr, and LMpca) showed higher rates of false-positive
associations than did LMMs accounting for K (Fig. 5, Table 1).
Regardless of whether they corrected for population stratification,
LMMs controlled for false-positive results at the nominal level.
This is probably due to the capture of the ancestry of individuals
by the variance-covariance structure modeled into the random
polygenic effect, as previously shown [5]. The results of analyses of
publicly available human and dog data also show that LMMped
was comparable to LMMstr and LMMpca in controlling for type-I
errors (Figs. 7 and 8). Thus, in a sample with structure at the
family and population levels, correction for K alone might be
sufficient to control for false-positive associations arising from
population structure.

Table 1. Average number of significant single-nucleotide polymorphisms (SNPs; 100 replicates) in five chromosomes (10,000
SNPs) without simulated quantitative trait loci.

| Model | Correction for | Significance level 0.05 | Significance level 0.005 | Significance level 0.0005 | Significance level 0.000005 |
|---|---|---|---|---|---|
| LMMped | K | 514.73* (7.83) | 53* (1.7) | 5.5 (0.4) | 0.02 (0.014) |
| LMMstr | P+K | 511.88 (7.9) | 51.97 (1.6) | 5.64* (0.4) | 0.02 (0.014) |
| LMMpca | P+K | 520.89** (7.6) | 53.92** (1.6) | 5.88** (0.4) | 0.02 (0.014) |
| LMMgmat | K | 472.15 (5.7) | 44.92 (1.2) | 4.45 (0.3) | 0.01 (0.01) |
| LM | none | 1014.51** (18.9) | 190.2** (6.8) | 36.93** (2.0) | 1.43** (0.17) |
| LMstr | P | 998.22** (17.8) | 184.33** (6.4) | 34.78** (1.8) | 1.34** (0.16) |
| LMpca | P | 968.01** (18.7) | 175.27** (6.5) | 32.01** (1.7) | 1.17** (0.16) |
| Expected false-positive associations | | 500 | 50 | 5 | 0.05 |

Standard errors are given in parentheses. The average number of significant SNPs (Sobs) in 100 replicates was compared with the expected number (Sexp) at different
significance levels using t-tests (H0: Sobs = Sexp; H1: Sobs > Sexp; *p<0.05, **p<0.01). Significance level of 0.000005 corresponds to a nominal significance level of 0.05 after
Bonferroni correction for 10,000 tests.

LMMped = Linear Mixed Model Including Pedigree-Based Relationship, LMMstr = Linear Mixed Model with STRUCTURE, LMMpca = Principal Component Analysis in a Linear
Mixed Model, LMMgmat = Linear Mixed Model Including Genomic Relationship, LM = Linear Model, LMstr = Linear Model with STRUCTURE, LMpca = Principal Component
Analysis in a Linear Model, P = admixture, K = Familial relationships.
doi:10.1371/journal.pone.0088926.t001

Figure 6. The power [% of quantitative trait loci (QTLs) detected] and precision (absolute distance between simulated and detected
QTL; gray bars) of the models in QTL localization.

In contrast, correction for P alone in the LMpca and LMstr was 
insufficient to control for false-positive associations. This is 
probably because STRUCTURE [9] and PCA [10] corrected 
for P, but subtle admixture at the family level was not captured by 
STRUCTURE (k=2) and LMpca (two PCs). Thus our findings 
indicate that correction for structure at the population level alone 
is not sufficient in samples with structure at the population and 
family levels. In such samples, LMMs can be effectively used to 
control for false-positive results. 

Among LMMs, LMMgmat was the most conservative and
consequently suffered a slight loss of power. In this model, the
relationship matrix (G) was estimated from the whole genome,
including markers linked to the simulated QTL. This probably led
to correction of part of the QTL effects in the data, thereby
reducing the model's power to detect QTLs.

Table 2. Absolute error (cM) in quantitative trait loci
localization.

| Model | Small effect | Large effect | All |
|---|---|---|---|
| LMMped | 0.62 | 0.38 | 0.40 |
| LMMstr | 0.66 | 0.39 | 0.41 |
| LMMpca | 0.66 | 0.38 | 0.40 |
| LMMgmat | 0.66 | 0.37 | 0.38 |

Precision is given as the absolute genetic distance between simulated and
detected quantitative trait loci (±1 cM).

LMMped = Linear Mixed Model Including Pedigree-Based Relationship,
LMMpca = Principal Component Analysis in a Linear Mixed Model, LMMstr = Linear
Mixed Model with STRUCTURE, LMMgmat = Linear Mixed Model Including Genomic
Relationship.

The compared models had powers of ~60% in detecting
large-effect QTLs. In our simulation, the same set of QTLs was
segregating in the two ancestral populations, which might not
accurately reflect the situation in real populations, where
subpopulations may have different causal variants affecting the
same trait, thereby reducing the power to detect such QTLs. Thus,
our power results likely reflect the upper boundary of detection
power in structured populations for QTLs with similar effects
using the same sample size. In our study, none of the tested
methods had the power to detect small-effect QTLs in the
simulated dataset. This finding might be explained by a lower
average LD ($r^2$ = 0.07) between adjacent markers and a similar LD
between markers and the QTL. Increasing the marker density and
sample size might increase the power to detect small-effect QTLs
in such a dataset.

It should also be noted that the LD levels in the dataset are
defined by the specific parameters (mutation rate, recombination
rate and type of mating) used in the simulation. The mutation rate
for markers assumed in this study was higher than what is observed
in real life (e.g., $10^{-8}$ per base per generation in humans; The 1000
Genomes Project Consortium, 2010 [21]). However, we had only
2,000 markers per chromosome (100 cM ≈ 100 Mb). Therefore, the
mutation rate considered in the study reflects the chance of having
a mutation in the genomic region that the SNP represents. Altering the
genetic parameters may change the LD structure; however, we do
not expect a change in the relative ranking of the methods in the
power of QTL detection; the same is true for the type-I error rate.
Moreover, we used chromosomes devoid of the simulated QTL to
assess the type-I error rate; hence the false associations arise only
due to stratification and not because of the LD between markers
and the QTL. Our study focuses on a quantitative trait, but the
findings can be generalized to case-control data, as reported by
Kang et al. [19], who found a variance component-based
method to be effective in correcting for stratification.

Figure 7. Quantile-quantile plot of -log10 P-values for association tests of the human GOLDN dataset using different models.

Figure 8. Quantile-quantile plot of -log10 P-values for association tests of the dog hip dysplasia dataset using different models.

The models showed comparable precision in QTL localization. 
Large-effect QTLs were localized with higher precision than were 
small-effect QTLs. This result contrasts with that of MacLeod et 
al. [20], who found that small-effect QTLs were mapped more 
precisely than large-effect QTLs. This difference is probably due 
to sampling bias, as the models in our study had no/very low 
power to detect small-effect QTLs, but ~60% power for large- 
effect QTLs; this situation allowed comparison of only a few 
observations. 

In our simulation, we found that in most cases (86) the first
principal component captured the genetic variation due to
population stratification (correlation of 0.77 with ancestry; see
Fig. S1). However, in nine cases the second PC, in three cases the
third PC and in two cases the fourth PC showed the highest
correlation with the ancestries (0.67, 0.58, and 0.52, respectively).
These observations suggest that in a dataset with structure at both
population and family level, the latter may account for more
genetic variation than the former. Therefore correction for only
the first two PCs might not always be sufficient to correct for
population stratification. It may thus be important to find out
which PCs explain the cryptic relationship better in each dataset
and include those to correct population stratification.
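
Checking which PCs track ancestry, and then correcting for the chosen PCs, can be sketched with NumPy. This is an illustration under stated assumptions rather than the procedure of [10]: PCs are taken from a singular value decomposition of the column-centred genotype matrix, correction is done by regressing on the PCs plus an intercept, and the function names and simulated toy data are hypothetical.

```python
import numpy as np

def principal_components(M, n_pcs):
    """Scores of the top n_pcs principal components of a genotype matrix."""
    M = np.asarray(M, dtype=float)
    centered = M - M.mean(axis=0)
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :n_pcs] * S[:n_pcs]

def pc_ancestry_correlations(pcs, ancestry):
    """Absolute correlation of each PC with the known ancestry."""
    return np.array([abs(np.corrcoef(pcs[:, j], ancestry)[0, 1])
                     for j in range(pcs.shape[1])])

def residualize(values, pcs):
    """Remove the part of values explained by the PCs (and an intercept)."""
    X = np.column_stack([np.ones(pcs.shape[0]), pcs])
    coef, *_ = np.linalg.lstsq(X, values, rcond=None)
    return values - X @ coef

# Hypothetical toy data: two groups of 30 with strongly different allele
# frequencies (0.2 versus 0.8) at 200 markers.
rng = np.random.default_rng(0)
ancestry = np.repeat([0.0, 1.0], 30)
freq = np.tile(np.where(ancestry == 0.0, 0.2, 0.8)[:, None], (1, 200))
M = rng.binomial(2, freq)

pcs = principal_components(M, n_pcs=4)
corr = pc_ancestry_correlations(pcs, ancestry)
best_pc = int(np.argmax(corr))          # index of the PC that tracks ancestry

y = ancestry + rng.normal(scale=0.1, size=60)
y_star = residualize(y, pcs[:, :2])     # phenotype corrected for two PCs
```

In this toy example the differentiation between groups is far stronger than in the simulated populations of the study, so the first PC captures ancestry; with weaker differentiation and family structure, the ranking of PCs by correlation may differ, as observed above.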

Our findings show that LMM approaches accounting for K are 
able to correct for structure at the family and population levels. 
We also found that the power and type-I error rate of LMMgmat 
with G were comparable to those of LMMped with A. Thus, 
LMMs may be used to correct for structure at the family and 